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ABSTRACT 

To facilitate locating archaeological sites before they are compromised or destroyed, we are developing approaches 
for generating maps of probable archaeological sites, through detecting subtle anomalies in vegetative cover, soil 
chemistry, and soil moisture by analyzing remotely sensed data from multiple sources. We previously reported 
some success in this effort with a statistical analysis of slope, radar, and Ikonos data (including tasseled cap 
and NDVI transforms) with Student’s t-test. We report here on new developments in our work, performing an 
analysis of 8-band multispectral Worldview-2 data. The Worldview-2 analysis begins by computing medians and 
median absolute deviations for the pixels in various annuli around each site of interest on the 28 band difference 
ratios. We then use principle components analysis followed by linear discriminant analysis to train a classifier 
which assigns a posterior probability that a location is an archaeological site. We tested the procedure using 
leave-one-out cross validation with a second leave-one-out step to choose parameters on a 9,859x23,000 subset 
of the WorldView-2 data over the western portion of Ft. Irwin, CA, USA. We used 100 known non-sites and 
trained one classifier for lithic sites (n=33) and one classifier for habitation sites (n=16). We then analyzed 
convex combinations of scores from the Archaeological Predictive Model (APM) and our scores. We found that 
that the combined scores had a higher area under the ROC curve than either individual method, indicating that 
including WorldView-2 data in analysis improved the predictive power of the provided APM. 
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1. INTRODUCTION 

Archaeological sites are being destroyed or compromised at a catastrophic rate in most regions of the world in 
this age of globalization. Populations are growing much more rapidly in developing countries than in developed 
countries. These populations are migrating to areas that in the past have been affected very little by human 
activities. As China, India, Brazil, and other countries aspire to levels of consumption enjoyed by the United 
States and Europe, large areas are cleared for agriculture and communities are springing up around mines, 
oil wells, and newly established industries. At the same time, tourism has been seized upon by developing 
countries because it holds the promise of jobs for which high levels of education and training are not necessary. 
If archaeological sites are not destroyed outright by construction of hotels, transportation systems, and other 
visitor amenities, the increasing numbers of people that inhabit or travel through heretofore isolated areas results 
in the discovery of archaeological sites not by archaeologists, but by indigenous populations. Site looters who are 
tied into the global, underground, and illicit trade of antiquities seek out members of indigenous communities 
with such knowledge and exploit them. For example, a Khmer statue that might have been obtained for a bowl 
of rice from a Cambodian living in Siem Reap might fetch thousands of dollars in one of the many antiquity 
boutiques in Bangkok. The only effective solution to all of the problems listed above is to find archaeological 
sites and landscapes before they are overtaken by agricultural and industrial development or found by looters, 
and then to take steps to protect them. 

Detecting archaeological sites by the analysis of imagery acquired by airborne and spaceborne sensors is 
often possible because human activities hundreds or even thousands of years ago has frequently caused localized 
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environmental changes that have persisted to the present. These can be the addition of humanly constructed 
features to the landscape that display a regular geometry and so are anomalous to their surroundings. Such 
features include houses, temples, causeways, roads, trails, agricultural terraces, reservoirs, canals, and adminis- 
trative structures. Human activities also produce changes in soil chemistry that can be detected directly, or that 
cause subtle anomalies in the vegetative cover because certain plant species flourish in or are stressed by altered 
soils. Non-native vegetation introduced in ancient times for food or for aesthetic purposes sometimes persists for 
great periods of time. 

Comer and BlonW 3 developed a means of utilizing aerial and satellite remote sensing data to identify probable 
locations of archaeological sites over a wide area based on detecting the subtle anomalies in vegetative cover 
described above. The remote sensing data utilized included SAR (synthetic aperture radar) data collected by 
the GeoSAR (airborne synthetic aperture radar) sensor, 11-bit multiple band optical data from the IKONOS 
sensor (operated by GeoEye, Inc.). A DEM (digital elevation model) was derived from the SAR data from which 
slope data was computed. These data, along with features computed from the IKONOS data served as input 
image layers for the anomaly detection approach. 

We report here on new developments in our work, performing linear discriminant analysis on 28 band difference 
ratios computed from 8-band multispectral Worldview-2 data. We first describe both the remotely sensed data 
and the archaeological site data that was utilized for training and testing our analysis approach. Then we 
elaborated on our new analysis approach, and the demonstrate the effectiveness of our approach over a test site 
over the western portion of Ft. Irwin, CA, USA and compare our results with an Archaeological Predictive 
Model (APM). 


2. DESCRIPTION OF THE DATA 

In this section we describe the remotely sensed data we utilized in our study, along with the archaeological site 
data that we utilize for training and testing our analysis approach. 

2.1 The Remotely Sensed Data 

The remotely sensed data employed in this study was multispectral WorldView-2 data. DigitalGlobe, Inc., 
Longmont, CO, USA (http://www.digitalglobe.com) contributed 14 swaths of WorldView-2 satellite imagery 
data to our study covering the entire land area of the China Lake and Ft. Irwin military reservations in 
California, USA. We report on our use of just one of these satellite imagery data swaths in this study, which 
covers a large part of the western portion of Ft. Irwin. This data was collected on December 30, 2010. While 
DigitalGlobe provided both panchromatic and multispectral data, we utilized only the multispectral data in our 
study. Table I shows the spectral bands provided by the WorldView-2 satellite. This data is orthorectified to 
2-meter ground resolution and has 11-bit radiometric resolution (stored as 16-bit integers). 


Table 1: WorldView-2 spectral bands (from DigitalGlobe Core Imagery Product Guide available on their website. 


Band 

Lower Band Edge (nm) 

Center Wavelength (nm) 

Upper Band Edge (nm) 

Coastal Blue 

396 

427 

458 

Blue 

442 

478 

515 

Green 

506 

546 

586 

Yellow 

584 

608 

632 

Red 

624 

659 

694 

Red Edge 

699 

724 

749 

Near Infrared 1 

765 

833 

901 

Near Infrared 2 

856 

949 

1043 


DigitalGlobe provided this swath of data in 45 separate 4096x4096 pixel blocks of data, which we mosaicked 
together to form an eight spectral band data set with 9859 columns and 61098 rows. We used in our analysis a 
9859 column by 23000 row subset of this data that covered the grounds of a western portion of Ft. Irwin. All 
locations outside of the Ft. Irwin land area boundary were masked out. 




Based on the suggestion of G. Marchisio of DigitalGlobe, 4 we utilized all possible combinations of band 
difference ratios of the eight multispectral bands as features for our study. Band difference ratios offer some 
robustness to changes in lighting conditions at different times of data collection, which will help in our eventual 
extension of our analysis of this portion of Ft. Irwin to other portions of Ft. Irwin and to the nearby China 
Lake site. These band difference ratio features are defined as 


Bi — B ; 

Bi-j = Vi > j. 

3 Bi +Bn’ J 


( 1 ) 


2.2 The Archaeological Site Data 

The archaeological site location data was stored in ESRI shapele (geospatial vector data) format with sites 
designated as Lithic Scatter or Habitation and Quarry. The tests in this study were performed using 62 Lithic 
Scatter sites and 22 Habitation sites. Precise for these sites were collected from October, 2010 to January, 2011 
by revisiting and evaluating sites that had been recorded over the past 50 years at Fort Irwin, California, which is 
located in the Mojave Desert. Fieldwork was supervised by author Douglas C. Comer with a crew of professional 
archaeologists. 

2.3 The Archaeological Predictive Model 

The site detection protocols described in this paper are properly thought of as an approach that is different 
in kind from an approach that has been used in archaeology for several decades now to predict areas in which 
archaeological sites are likely to be found. Such archaeological predictive models (APMs) do not associate certain 
returns from aerial and satellite remote sensing devices with the locations of archaeological sites, as we do in this 
paper. Instead, they are intended to identify areas within which sites might be found, rather than finding the 
sites themselves. That is, APMs predict on the basis of possibilism instead of probablism. Being possibilistic, 
APMs deal only with how suitable an area is for a certain activity. It is important to note that not all suitable 
areas will be used. For this reason, APMs have had limited success in predicting the locations of archaeological 
sites. 

Nonetheless, they have had some success, and are the standard against which new the results of new ap- 
proaches can be compared. Also, in the future, some of the approaches and technologies that we have developed 
might very well be used to improve the performance of APMs. For this reason, we utilized an APM that had 
been developed for one of our test areas, Fort Irwin. It was produced by the University of Illinois at Urbana- 
Champaigns CIS and Geospatial Lab, through the U. S. Army Corps of Engineers Construction Engineering 
Research Laboratory (CERL) in Champaign, Illinois. 5 

Utilizing a variety of statistical techniques, Ruiz attempted to develop associations between the following 
environmental factors and the presence of archaeological sites: 

• Distance to spring or lake 

• Soil surface texture 

• Soil water content 

• Depth to bedrock 

• Landform type 

• Elevation 

• Slope 

• Direction of slope (aspect) 



Fort Irwin; Central Corridor and Eastern Expansion 
Habitation Sites shown with Predictive Model 



Figure 1: The Archaeological Predictive Model for Habitation Sites 


While the statistical treatments that were used to develop the APM for Fort Irwin are not well described, 
the performance of the resulting model was good as measured by the metric of gain, which has been devised for 
this purpose. The formula used to calculate the gain statistic is: 1 — % ° ^ - 6 

The APM for Fort Irwin was actually composed of several APMs, each predicting areas in which different 
types of archaeological sites might be found. We will consider here only the APMs done for only two types of 
sites: habitation and lithic. An APM for these habitation sites is seen in Fig 1. Areas in green are those which 
are predicted to be most favorable for the presence of archaeological sites, red the least, and colors between 
those two in order of decreasing likelihood. For the highest class in the APMs (the area on the final favorability 
map that is colored dark green to indicate that the area is the most favorable for the type of site in question) 
the gain statistic is 0.86 for habitation sites. For the green area in the Lithic APM, a gain statistic of 0.67 
was calculated. ROC curves presented elsewhere in this paper compare the performances of APMs and the site 
detection protocols that we have developed. It is significant that the two approaches can be combined to produce 
a better performance than could be obtained by either alone. 

3. ANALYSIS APPROACH 

In this work, we present a framework to classify whether a location is likely to be an archaeological site based 
on multi-spectral image analysis. Image processing is first used to form a quantitative representation of the 
location. Secondly, feature extraction is used on this image processed representation to form a low dimensional 
representation of the location. The final step is to classify this low dimensional representation to determine 
whether or not the location is suitable for an archaeological site. In order to build this classifier, appropriate 



training data and classification techniques must be used. We validated this method by comparing with the 
archaeology predictive model (APM). By considering combined classifiers using the APM and the WorldView 
analysis we can demonstrate that our method provides additional discrimination power over the APM. 

In this section, we present the statistical setting and a general framework for image processing, dimensionality 
reduction and classification and conclude with a discussion of our parameter selection and validation technique. 

3.1 Statistical Setting 

We consider a multi-spectral image as B random images. Formally, let w, h, and B be positive integers. Let 
S = {1, 2, . . . , id} x {1,2,..., h}. Each element s = (si, S 2 ) G S corresponds to a specific geographical location, 
such as latitude and longitude coordinates. A feature image B £ K u,xft is a random image which maps each 
location to an intensity value. A multispectral image is a H-tuple of bands {B^ l \B^ 2 \ . . . ,8^) € . 

The value of band b € {1,2 at location s € S is denoted B[ b \ We assume that the bands are fully 
observed. Note that the bands in the multi-spectral image need not be the original bands but could be derived 
or preprocessed bands, such as the band difference ratio features. 

In addition to the bands, we suppose that each site location is given a class label in {0, 1, ... , K — 1} which is 
encoded in the random image y € {0, 1, . . . , K — l}“’ x/! for some positive integer K. We take K = 2 and denote 
the class label at location s by y s . Class label y s = 1 denotes that location s is considered to have archaeological 
significance. Class label 34 = 0 denotes that site s has no archaeological significance. 

All together, we have a (B + l)-tuple (Bl 1 -* , B ^ , . . . , B ^ , 30 which is distributed according to some unknown 
joint distribution. The bands, the pixels within each band, and the class labels will all depend on each other. 
The unknown parameters for this distribution will be determined by the geographical formations, environmental 
and historical conditions, the sensor used to collect the image, as well as many other factors. Due to these 
complications, we do not attempt to estimate its parameters directly. 

Finally, we suppose that there are n locations for which ground truth has been gathered. Formally, there are n 
locations s^ 2 \ . . . , s ^ € S and for each location we have observed the class label 34(0 • These locations 
can be used for training our method and testing our classifier against the ground truth. All other class labels 
are un-observered. 

We consider the classification algorithm as consisting of three components the image processor, feature 
extractor and classifier, determined by the functions /, g, and h: 

f :S x (R wxh ) B i-^R J , (2) 

g : R^ — » R d with d < d (3) 

h : R d — >■ {0, 1} (4) 

The classification algorithm is then given by the composition h o g o f. Each of the functions /, g, and h will 
be chosen from some class and may depend on the sites with observed class labels. Indeed, our paper considers 
using statistics on annuli centered at the site for image processing, PCA for feature extraction and LDA for 
classification. Figure 2 provides a diagrammatic representation of this protocol. 

3.2 Image Processing 

The image processing step transforms the bands into statistics about each location. We formally define an image 
processor as a function 

f :S x {R wxh ) B (5) 

that takes a location and B image bands and returns a vector. This function returns local statistics about 
the image for each location. For location s € S let X s = /(s, B^\ . . . , B^) and for location sW fot = 



Site 



Figure 2: A diagrammatic representation of our protocol. 


For this work, we suppose that 


f(s,B^,...,B^) 


//i(s,£ (1) )\ 

/ 2 (s,£ (2) ) 

\f B (s,B^)J 


£R J , Vs,£ (1) ,...,£ (B) 


( 6 ) 


where fb'-Sx M. wxh i— >• R d / S depends only on band b for each b. (Clearly, we assume d is an integer multiple of 
B.) Further we assume that all the functions fb are identical, so that the same statistics are computed for each 
image. These assumptions may not be appropriate in some instances but is a useful simplification in the current 
setting. 

In this paper, we use a particular imaging processing method based on statistics of pixels in annuli centered 
at each site. The image processor was motivated by the observation that the archaeological sites are typically 
circular and of relatively consistent sizes. 

Let 

A 8 (r( in \r(°“ t )) ={s'e5: r (in) < \\a - s'|| < A out) } (7) 

be the set of coordinates in the annulus centered at s £ S with inner radius A m ' > and outer radius A out \ We 
define u:S x R wxh 1-4 R 30 and <5 : 5 x R wXh ^ R 30 by 


Vi{x, B) = median{B x , : x' £ A x {r^ n \ r^° ut) )} (8) 

Si{x , B) = MAD{6^ : x' £ A x {r^ n \ r-°“ 4) )} 

= median^Ba,/ - Vi(x)\ : x' £ A x (rj m) ,rj 0 " 4) )} 


for * £ {1,2,..., 30} where MAD denotes the sample median absolute deviation. 
The inner and outer radii ranges we consider are given in this table: 


i 

1 

2 

3 .. 

. 10 

11 

12 

13 . 

. 20 

21 

22 

23 .. 

. 30 

r (*«) 

0 

3 

6 .. 

. 27 

0 

5 

10 . 

. 45 

0 

7 

14 .. 

. 63 

r (out) 

2 

5 

8 .. 

. 29 

4 

9 

14 . 

. 49 

6 

13 

20 .. 

. 69 


Since the annuli overlap and cover a relatively large area, the statistics for these ranges will likely contain some 
redundant information so the feature extraction step will be useful to reduce this redundancy while maintaining 
the components that discriminate between classes. 

We now define the image processing function fb : S x R wxh h-> R 2 ' 30 by fb(x,B) = [v(x, B) T , S(x, B) T ] T . As 
in Eqn. 6, this induces the function f : S x R wxhxB R 2 ' 3 o b 






3.3 Feature Extraction 


Each location is now represented by a vector in R d (with d = 605). The dimension d is much larger than the 
sample size n. This leads to classification difficulty because not enough data is available to build a reasonable 
model. Such high dimensionality can also result in complicated computational issues and reduces classification 
accuracy. This problem is termed as “the curse of dimensionality” . ' ’ 8 Feature extraction for lowering the 
dimensions in feature space is thus necessary. It eliminates irrelevant features, reduces noise in the data set. 
Moreover it converts high dimensional feature space to a lower dimension that captures most variation of the 
original data. Mathematically, we define g as the following: 


g : R d — » R. with d < d 

Note that g depends on the labeled data {Xi,Y\) i ..., (X n ,Y n ). We denote the feature extracted vectors 
for location s € S as X s — g(X s ) = (/og)(s,BW,...,SW) and for location sW as X W = g(X^) = (/ o 

5 )( S W,5«,..., £(”)). 

In this paper, principal components analysis (PCA) is used for feature extraction. Suppose the sample mean 
of the vectors Xi,...,X n is zero; otherwise, we centralize the vectors by subtracting its mean. Let £ £ R dxd be 
the sample covariance matrix corresponding to the data matrix D whose rows are data instances and columns 
are features. Note that the covariance matrix is positive semi- definite; thus it has all but negative eigenvalues 
and its eigenvectors form an orthonormal space. Let U £ R dxd be a matrix whose columns are orthonormal 
eigenvectors corresponding to the d largest eigenvalues of £. The feature extractor g of PCA is defined by 
g(x) = U T x £ R d for all x £ R d . Hence, the first principal component, a linear combination of the original 
features, is the dimension with the most variance in the original feature space; the second principal component, 
orthogonal to the first, captures second most variation of the data set and so on.... Usually, a small fraction of 
the total dimension is needed via PCA. To select the PCA dimension d we use cross validation (see section 3.5). 


3.4 Classification 

We define a classifier as a function h : R d i— >• {0, 1}. After image processing and feature extraction, h assigns a 
class label to each data vector on the lower dimension space. 

The simple classifier we choose is linear discriminant analysis (LDA). LDA is the Bayes plug-in classifier under 
the assumption that the data is distributed according to class conditional multivariate normal distributions with 
the same covariance matrices. We estimate the means and covariance by 


dv = 


£ x 

i:Y( i )=y 


and £ = — - Mix® _ , 

*= 1 j =4 


( 10 ) 


respectively. We then have the estimate of the posterior probability that a site with feature extracted vector x 
is from class 1 defined by 


i)(x) = P(y = 1\X = x) 




/A,£) + f 4>(x; /io,£)’ 


( 11 ) 


where n y is the number of training samples with class label y and (j>{-\ /z, £) is the density for a multivariate 
normal distribution with mean /i and covariance £. 

The classifier h is defined as h(x) = I{f)(x) > r} for some threshold value r £ R. 



3.5 Parameter Selection and Validation 


The key parameters we must select for this algorithm are the PCA dimension d and the threshold parameter r. 
The threshold value r will depend on how the classifier is being used so we do not attempt to explicitly select a 
threshold but consider a range of thresholds. We use leave-one-out cross validation to select the PCA dimension. 

Leave-one-out cross validation is a technique to estimate the error of probability of a classifier that is trained 
on data. The procedure loops through each data sample, removing it and training on the remaining data. We 
then evaluate the resulting classifier on the removed sample and the estimated error rate is then the percent of 
the samples that were incorrectly classified. This results in a more accurate estimate of the generalization error 
of the procedure since we do not test on data that was trained on. 

To select the PCA dimension d from among {1, 2, . . . , n — 1} we perform a leave-one-out estimate of error for 
each d and then select the dimension which results in the smallest error. For these estimates we use a threshold 
value of r = 0.5. Finally, we train a final classifier on the whole dataset. This classifier then allows us to compute 
the posterior 77 for any new location. 

To assess this procedure, we again use leave-one out cross validation. This means we have two loops of cross 
validation, an outer loop to train the classifier for each removed sample and compute a posterior and an inner 
loop select the PCA dimension. Additionally, we use receiver operating characteristic (ROC) curves to asses the 
the procedure. The ROC curve is parametrized by the the threshold r and for each value of r we plot the the 
percentage of false positives on the z-axis and the percentage of true positives on the y-axis. Depending on the 
application, one may be concerned with different aspects of the ROC curve. The area under the curve (AUC) is 
a summary statistic that computes the area under the ROC curve. This ranges from 0, reverse classification to 
1, perfect classification. 


4. RESULTS 

We analyzed the above procedure using the 28 band difference ratio feature bands derived from the Worldview-2 
data. For our true sites we consider two different classes, lithic sites (ni = 33) and habitation sites (ni = 16). 
We trained separate classifiers for the lithic and habitation classes. For the class zero training data we selected 
no = 100 locations uniformly at random from the surveyed region of the image swath. We also ensured that the 
selected locations were at least 100 pixels from every known true site. Since the negatives are selected from the 
surveyed regions and are far from known sites, we can safely assume that these sites do not have archaeological 
significance. 

The result of our analysis is a posterior probability for each site derived from classifier trained on the re- 
maining sites. We denote this posterior by f/wv for the WorldView dervied posterior. Independent of the 
WorldView analysis, the archaeological predictive model (APM) provides a score which, when normalized to be 
in {0, 0.2, 0.4, 0.6, 0.8, 1}, can be considered as an alternative posterior estimate which we denote t)apm ■ We 
consider convex combinations of the APM posterior and the Worldview analysis posterior. For each value of 
7 € [0, 1] this gives a new posterior estimate: 

Combined 7 = (1 — y)APM + yWorldView (12) 

f? 7 = (! - 7 )Vapm + Idwv (13) 

Figure 3 shows the AUC value for each 7 £ [0, 1] where 0 represents using only the APM and 1 represents 
using on the WorldView-2 analysis. For the habitation sites, Figure 3a, we see that the APM outperforms the 
WorldView but that a combined classifier with 7 ft: .3 provides even better performance. Noticeably, there is a 
jump in the AUC at 7 = 0. This is due to the fact that the APM gives only 6 possible values so for 7 near zero 
the WorldView analysis acts as a tie breaker for sites that have the same APM score. Similar behavior is see for 
lithic sites, Figure 3b, but in this case the WorldView analysis outperforms the APM and that the a combined 
classifier with 7 ft .65 provides even better performance. 

Figure 4 shows the receiver operating characteristic (ROC) curve of lithic sites and habitation sites corre- 
sponding to five classifiers: APM, Worldview-2, half of each (7 = 0.5), the optimal combined model, and the 
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Figure 4: ROC curves for various choices of 7 . 


tie-breaking model. For different choices of false positive percentage (or true positive percentage) we see that 
the combined model typically improves performance over using either the APM and the WorldView analysis 
alone. The consistent superior performance of the combined model to APM, indicate that adding-in our method 
increases prediction power over using the APM alone. 

4.1 Discussion 

In this paper we consider one particular example of a statistical approach to archaeological site discovery using 
multi-spectral imagery. In particular we consider the framework specified by three components: an image 
processor, feature extractor, and classifier. Specifically, we use a relatively simple image processor in the annuli 
technique and common feature extractors and classifiers in PCA and LDA. 

We demonstrate that our method provides performance which is better than chance and which provides 
additional improved performance when used in conjunction with an archaeological predictive model. It gives 
strong evidence that including our method in archaeological prediction analysis improves classification accuracy. 
We do not claim that these represent optimal results but rather that this framework provides a valuable way to 
attach the problem of archaeological site discovery. Almost certainly there are other methods for each component 
which will give a further improvement in performance. 


For example, the archaeological sites will have some known properties which will help in choosing an image 
processor. For example, it might be desirable to have fb invariant to rotations, changes in scale, or changes in 
lighting. On the other hand, if we are attempting detect sites with a certain aspect, size, or brightness, then 
these invariance properties would be detrimental. There is a rich literature regarding useful image processors for 
remote sensing; 9 however, no image processor is optimal for every task so using prior knowledge about the sites 
and the imagery will be critical. 

We used PCA for feature extraction and applied LDA on the reduced space. There is a wealth of empirical 
evidence that the composition of these two procedures often gives improved results over using LDA on the original 
feature space. One may also use a different feature extractor and predict the data with classifiers such as support 
vector machine, boosting, k-nearest neighbor. The literature on feature extraction and classification is vast and 
much improvement is likely possible within this basic framework. 

In addition to changing the components, we also suspect that including additional feature bands, includ- 
ing especially slope data and possibly data related to aspect and other topographical properties, will improve 
performance. Future work will consider this framework with these additional features. 

5. CONCLUSION 

We have reported here on our latest work in generating maps of probable archaeological sites. This work follows 
up on a previous study that utilized the Student’s t-test. We report here on our work with a new, more flexible 
approach utilizing principle components analysis followed by linear discriminant analysis. In a future study 
we hope to compare the effectiveness and ease of use of the two approaches, as well as how they might most 
effectively be combined. We compared our new approach to an archaeological predictive model (APM), which 
has been considered as the standard means for identifying areas that are likely to contain archaeological sites. 
While the APM had a better performance than our new method for habitation sites, our method outperformed 
the APM for lithic sites and a combination of our method with the APM outperformed either method by itself. 
Further, we have determined that the statistical treatments that we employed in this study can be used to greatly 
improve APMs. By combining improved APMs and the image analysis protocols that we have develop, we are 
optimistic that we can develop even more productive methods of finding archaeological sites in the future. 

We have demonstrated the effectiveness of our new approach over a portion of the Ft. Irwin military reser- 
vation in California. We plan to extend our work to other portions of Ft. Irwin and also to the nearby China 
Lake military reservation, with the ultimate objective of establishing the capacity to conduct rapid, wide-areas 
surveys for archaeological sites with results as reliable, or nearly so, as those obtained from on-ground surveys 
for archaeological sites. We also will be working to gain acceptance of this survey approach by the Advisory 
Council on Historic Preservation and State Historic Preservation Offices as a substitute for such time-consuming 
and more expensive on ground surveys, or to direct on-ground efforts in ways that will great reduce the time 
and expense required for such surveys. 


ACKNOWLEDGMENTS 

Funding for this research was provided by the Department of Defense Legacy Resource Management Program, 
and we would like to thank Cecilia Brothers, Legacy Cultural Resource Management Specialist for that program, 
for her continuing encouragement and support. Funding was also provided through The Johns Hopkins University 
Applied Mathematics and Statistics Department, with which three of our authors are affiliated, and we thank 
many in that department for their interest in this project. We would also like to thank DigitalGlobe, Inc., 
Longmont, CO, USA for contributing the WorldView-2 data used in this study, and in particular Giovanni 
Marcliisio of DigitalGlobe for making the initial offer of this data and facilitating our receiving it. Finally, we 
thank John Brantley Jackson, Installation Archaeologist at Fort Irwin, for his assistance in acquiring LIDAR 
data for Fort Irwin and his support of our research. 



REFERENCES 


[1] Comer, D. C. and Blom, R. G., “Detection and identification of archaeological sites and features using 
synthetic aperture radar (sar) data collected from airborne platforms,” in [Remote Sensing in Archaeology], 
Wiseman, J. R., El-Baz, F., ed., 103-136, Springer Science + Business Media, LLC, New York (2007). 

[2] Comer, D. C. and Blom, R. G., “Remote sensing and archaeology: Tracking the course of human history 
from space,” Earth Imaging Journal March/ April (2007). 

[3] Comer, D. C. and Blom, R., “Wide area inventory of archaeological sites using aerial and satellite data sets: 
Prologue to resource monitoring and preservation,” Proc. 32nd Int. Symp. on Remote Sensing of Environment 
(June 2007). 

[4] Marchisio, G., Padwick, C. and Pacifici, F., “Evidence of improved vegetation discrimination and urban 
mapping using worldview-2 multi-spectral imagery,” ASPRS Annual Conference (May 2010). 

[5] Ruiz, M. O., “The development and testing of an archaeological predictive model,” tech, rep., U.S. Army 
Corps of Engineers Construction Engineering Research Laboratory, Champlain, Illinois (2003). 

[6] Kvamme, K. L., “Development and testing of quantitative models,” in [ Quantifying the Present and Pre- 
dicting the Past: Theory, Method, and Application of Archaeological Predictive Modeling ], Judge, W. J. and 
Sebastian, L., eds., 325-428 (1988). 

[7] Duda, R. O., Hart, P. E., and Stork, D. G., [ Pattern Classification ], Wiley, 2nd ed. (2001). 

[8] Tan, P.-N., Steinbach, M. and Kumar, V., [ Introduction to Data Mining], Addison Wesley (May 2005). 

[9] Schowengerdt, R. A., [Remote Sensing: Models and Methods for Image Processing], 3rd ed. (September 
2006). 



